现有的可解释人工智能(XAI)算法的界限仅限于技术用户对解释性的需求所基于的问题。这项研究范式不成比例地忽略了XAI的非技术最终用户的较大群体,他们没有技术知识,但需要在其AI-ASS辅助批判性决定中进行解释。缺乏以解释性为重点的功能支持可能会阻碍在医疗保健,刑事司法,金融和自动驾驶系统等高风险领域中对AI的安全和负责任的使用。在这项工作中,我们探讨了如何设计为最终用户的关键任务量身定制的XAI如何激发新技术问题的框架。为了引起用户对XAI算法的解释和要求,我们首先将八个解释表格确定为AI研究人员和最终用户之间的通信工具,例如使用功能,示例或规则来解释。然后,我们在实现不同的解释目标(例如验证AI决策并改善用户的预测结果)的背景下,使用32名外行参与者进行用户研究。基于用户研究结果,我们确定并提出新颖的XAI技术问题,并根据用户的解释目标提出评估度量验证能力。我们的工作表明,在最终用户使用XAI中解决技术问题可以激发新的研究问题。这样的最终用户启发的研究问题有可能通过使人工智能民主化并确保在关键领域中对AI负责使用,从而促进社会利益。
translated by 谷歌翻译
One of the key challenges in deploying RL to real-world applications is to adapt to variations of unknown environment contexts, such as changing terrains in robotic tasks and fluctuated bandwidth in congestion control. Existing works on adaptation to unknown environment contexts either assume the contexts are the same for the whole episode or assume the context variables are Markovian. However, in many real-world applications, the environment context usually stays stable for a stochastic period and then changes in an abrupt and unpredictable manner within an episode, resulting in a segment structure, which existing works fail to address. To leverage the segment structure of piecewise stable context in real-world applications, in this paper, we propose a \textit{\textbf{Se}gmented \textbf{C}ontext \textbf{B}elief \textbf{A}ugmented \textbf{D}eep~(SeCBAD)} RL method. Our method can jointly infer the belief distribution over latent context with the posterior over segment length and perform more accurate belief context inference with observed data within the current context segment. The inferred belief context can be leveraged to augment the state, leading to a policy that can adapt to abrupt variations in context. We demonstrate empirically that SeCBAD can infer context segment length accurately and outperform existing methods on a toy grid world environment and Mujuco tasks with piecewise-stable context.
translated by 谷歌翻译
Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: download a copy of a foundation model, and fine-tune it using some in-house data about the target task of interest. Consequently, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks. Yet, these individual fine-tunings often lack strong generalization and exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain diverse features. Based on this insight, we propose model recycling, a simple strategy that leverages multiple fine-tunings of the same foundation model on diverse auxiliary tasks, and repurposes them as rich and diverse initializations for the target task. Specifically, model recycling fine-tunes in parallel each specialized model on the target task, and then averages the weights of all target fine-tunings into a final model. Empirically, we show that model recycling maximizes model diversity by benefiting from diverse auxiliary tasks, and achieves a new state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, model recycling is a contribution to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to incrementally and reliably update machine learning models.
translated by 谷歌翻译
Does the dominant approach to learn representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions. Our thesis is that such scenarios are better served by representations that are "richer" than those obtained with a single optimization episode. This is supported by a collection of empirical results obtained with an apparently na\"ive ensembling technique: concatenating the representations obtained with multiple training episodes using the same data, model, algorithm, and hyper-parameters, but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained from scratch. This proves that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.
translated by 谷歌翻译
Reinforcement learning often suffer from the sparse reward issue in real-world robotics problems. Learning from demonstration (LfD) is an effective way to eliminate this problem, which leverages collected expert data to aid online learning. Prior works often assume that the learning agent and the expert aim to accomplish the same task, which requires collecting new data for every new task. In this paper, we consider the case where the target task is mismatched from but similar with that of the expert. Such setting can be challenging and we found existing LfD methods can not effectively guide learning in mismatched new tasks with sparse rewards. We propose conservative reward shaping from demonstration (CRSfD), which shapes the sparse rewards using estimated expert value function. To accelerate learning processes, CRSfD guides the agent to conservatively explore around demonstrations. Experimental results of robot manipulation tasks show that our approach outperforms baseline LfD methods when transferring demonstrations collected in a single task to other different but similar tasks.
translated by 谷歌翻译
Nonnegative Tucker Factorization (NTF) minimizes the euclidean distance or Kullback-Leibler divergence between the original data and its low-rank approximation which often suffers from grossly corruptions or outliers and the neglect of manifold structures of data. In particular, NTF suffers from rotational ambiguity, whose solutions with and without rotation transformations are equally in the sense of yielding the maximum likelihood. In this paper, we propose three Robust Manifold NTF algorithms to handle outliers by incorporating structural knowledge about the outliers. They first applies a half-quadratic optimization algorithm to transform the problem into a general weighted NTF where the weights are influenced by the outliers. Then, we introduce the correntropy induced metric, Huber function and Cauchy function for weights respectively, to handle the outliers. Finally, we introduce a manifold regularization to overcome the rotational ambiguity of NTF. We have compared the proposed method with a number of representative references covering major branches of NTF on a variety of real-world image databases. Experimental results illustrate the effectiveness of the proposed method under two evaluation metrics (accuracy and nmi).
translated by 谷歌翻译
End-to-end autonomous driving provides a feasible way to automatically maximize overall driving system performance by directly mapping the raw pixels from a front-facing camera to control signals. Recent advanced methods construct a latent world model to map the high dimensional observations into compact latent space. However, the latent states embedded by the world model proposed in previous works may contain a large amount of task-irrelevant information, resulting in low sampling efficiency and poor robustness to input perturbations. Meanwhile, the training data distribution is usually unbalanced, and the learned policy is hard to cope with the corner cases during the driving process. To solve the above challenges, we present a semantic masked recurrent world model (SEM2), which introduces a latent filter to extract key task-relevant features and reconstruct a semantic mask via the filtered features, and is trained with a multi-source data sampler, which aggregates common data and multiple corner case data in a single batch, to balance the data distribution. Extensive experiments on CARLA show that our method outperforms the state-of-the-art approaches in terms of sample efficiency and robustness to input permutations.
translated by 谷歌翻译
我们研究在线动态定价的问题,具有两种类型的公平限制:“程序公平性”,要求拟议的价格在不同群体之间的预期等同于期望,而“实质性公平”要求公认的价格要求公认的价格在预期中保持平等在不同的群体中。同时进行程序和实质性公平的政策称为“双重公平”。我们表明,双重公平的政策必须是随机的,才能获得比将相同价格分配给不同群体的最佳琐碎政策更高的收入。在两组设置中,我们为达到$ \ tilde {o}(\ sqrt {t})$遗憾的两组定价问题提供了在线学习算法,零过程不公平和$ \ tilde {o}(\ sqrt {t})$对$ t $回合学习的实质性不公平。我们还证明了两个下限,表明这些结果是遗憾和不公平性的,这两者在理论上都是最佳的,直到迭代的对数因素。据我们所知,这是第一个学会定价的动态定价算法,同时满足了两个公平的约束。
translated by 谷歌翻译
事件摄像头是一种新兴的生物启发的视觉传感器,每像素亮度不同步地变化。它具有高动态范围,高速响应和低功率预算的明显优势,使其能够在不受控制的环境中最好地捕获本地动作。这激发了我们释放事件摄像机进行人姿势估计的潜力,因为很少探索人类姿势估计。但是,由于新型范式从传统的基于框架的摄像机转变,时间间隔中的事件信号包含非常有限的信息,因为事件摄像机只能捕获移动的身体部位并忽略那些静态的身体部位,从而导致某些部位不完整甚至在时间间隔中消失。本文提出了一种新型的密集连接的复发架构,以解决不完整信息的问题。通过这种经常性的体系结构,我们可以明确地对跨时间步骤的顺序几何一致性进行明确模拟,从而从以前的帧中积累信息以恢复整个人体,从而从事件数据中获得稳定且准确的人类姿势估计。此外,为了更好地评估我们的模型,我们收集了一个基于人类姿势注释的大型多模式事件数据集,该数据集是迄今为止我们所知的最具挑战性的数据集。两个公共数据集和我们自己的数据集的实验结果证明了我们方法的有效性和强度。代码可以在线提供,以促进未来的研究。
translated by 谷歌翻译
为设计控制器选择适当的参数集对于最终性能至关重要,但通常需要一个乏味而仔细的调整过程,这意味着强烈需要自动调整方法。但是,在现有方法中,无衍生物的可扩展性或效率低下,而基于梯度的方法可能由于可能是非差异的控制器结构而无法使用。为了解决问题,我们使用新颖的无衍生化强化学习(RL)框架来解决控制器调整问题,该框架在经验收集过程中在参数空间中执行时间段的扰动,并将无衍生策略更新集成到高级参与者 - 批判性RL中实现高多功能性和效率的体系结构。为了证明该框架的功效,我们在自动驾驶的两个具体示例上进行数值实验,即使用PID控制器和MPC控制器进行轨迹跟踪的自适应巡航控制。实验结果表明,所提出的方法的表现优于流行的基线,并突出了其强大的控制器调整潜力。
translated by 谷歌翻译